Overview and conditions of access
2025-10-01
Integrated data: when non-survey data are added to survey data
Whether part of the original data collection or not
Whether primary or secondary
Whether same unit of analysis or not
Validation or enhancement (Benzeval et al. 2020)
Typically administrative records, measured data, social media data
Examples include accelerometer data, genetic data, individual NHS data, social security data
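As a rough illustration of what integration means in practice, the sketch below links a toy survey file to a toy administrative file on a shared person identifier. The column names (`pid`, `self_rated_health`, `hospital_episodes`) and all values are hypothetical and do not come from any UK Data Service dataset; real linkages are carried out by the data owners, not by end users.

```python
import pandas as pd

# Hypothetical survey responses, one row per respondent
survey = pd.DataFrame({
    "pid": [1, 2, 3],
    "self_rated_health": ["good", "fair", "poor"],
})

# Hypothetical administrative records for some of the same people
admin = pd.DataFrame({
    "pid": [2, 3, 4],
    "hospital_episodes": [1, 4, 2],
})

# Keep every survey respondent; attach admin data where a match exists
linked = survey.merge(admin, on="pid", how="left", indicator=True)

# Number of survey respondents successfully linked to an admin record
n_linked = int((linked["_merge"] == "both").sum())
print(linked)
print("linked respondents:", n_linked)
```

A left join is used here so that unlinked respondents stay in the survey file with missing administrative values, which makes the linkage rate easy to inspect.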
This talk deals mostly with integrated data available from the UK Data Service
Partly depends on the kind of data linked to surveys
… And the scientific teams that performed the linkage, i.e. research-based vs
Major longitudinal studies
A few large scale cross-sectional government surveys
The largest UK longitudinal study
Initial sample size: 40K households, 100K individuals
14 yearly waves so far: 2009-2023; includes BHPS data 1991-2002
Ethnic minority boost samples; Innovation Panel
Very wide range of topics covered:
Administrative records
i.e. data collected by a public (i.e. state-controlled) authority: a government department, the NHS
Health: NHS SHS: medical records, i.e. in/outpatient attendance, hospital episodes, maternity
Education: DfE, National Pupil Database, school attendance; school profile/teacher survey; distance to grammar school; student loan data, OFSTED data
Pollution; green space deciles; PAYE data
Social media/Digital trace
Polygenic scores, or polygenic indices (PGI), for health and social outcomes
Gene combinations associated with probability of certain outcomes
45 traits, i.e. health outcomes and behaviour; mental health and personality traits; social outcomes
Available on the Birth Cohorts and Next Steps datasets
Subsamples limited to ‘Europeans’ from a genetic perspective
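To make the idea of "gene combinations associated with the probability of certain outcomes" more concrete: polygenic scores are commonly computed as a weighted sum of allele counts across many genetic variants. The sketch below shows that general calculation with made-up genotypes and weights; it is an assumption about the usual approach, not a description of how the scores in these datasets were built.

```python
import numpy as np

# Hypothetical genotypes: rows = individuals, columns = genetic variants,
# values = number of copies (0, 1 or 2) of the effect allele
genotypes = np.array([
    [0, 1, 2],
    [2, 0, 1],
])

# Hypothetical per-variant weights (e.g. effect sizes from an association study)
weights = np.array([0.10, -0.05, 0.20])

# Raw score: weighted sum of allele counts for each individual
raw_scores = genotypes @ weights

# Scores are usually standardised within the sample before analysis
z_scores = (raw_scores - raw_scores.mean()) / raw_scores.std()
print(raw_scores, z_scores)
```

Because the weights are estimated in a reference population, a score of this kind is only comparable among people of similar genetic ancestry, which is one reason such subsamples tend to be restricted.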
OFSTED ‘State of the nation’: anonymised data on the latest school inspection outcomes for 22,000 open schools
Linked with the MCS, currently covers years 2005 to 2019
Data on a wide range of topics, such as:
Main employer pension scheme for UK employees
Covers 1,000,000 employers, 11 million employees
Linked to consenting Understanding Society Wave 11 respondents (about 12,000)
Data about: